38 research outputs found

Detecting common copy number variants in high-throughput sequencing data by using JointSLM algorithm

Author: Benelli Matteo
Magi Alberto
Roviello Franco
Torricelli Francesca
Yoon Seungtai
Publication venue: Oxford University Press
Publication date: 01/01/2011

The discovery of genomic structural variants (SVs), such as copy number variants (CNVs), is essential to understand genetic variation of human populations and complex diseases. Over recent years, the advent of new high-throughput sequencing (HTS) platforms has opened many opportunities for SV discovery, and a very promising approach consists in measuring the depth of coverage (DOC) of reads aligned to the human reference genome. At present, few computational methods have been developed for the analysis of DOC data, and all of these methods can analyse only one sample at a time. For these reasons, we developed a novel algorithm (JointSLM) that can detect common CNVs among individuals by analysing DOC data from multiple samples simultaneously. We test JointSLM performance on synthetic and real data and we show its unprecedented resolution, which enables the detection of recurrent CNV regions as small as 500 bp in size. When we apply JointSLM to analyse chromosome one of eight genomes with different ancestry, we identify 3000 regions with recurrent CNVs of different frequency and size: hierarchical clustering on these regions segregates the eight individuals in two groups that reflect their ancestry, demonstrating the potential utility of JointSLM for population genetics studies.

Archivio della Ricerca - Università degli Studi di Siena

Florence Research

PubMed Central

Inferring Haplotypes of Copy Number Variations From High-Throughput Data With Uncertainty

Author: Hosono Naoya
Kato Mamoru
Leotta Anthony
Sebat Jonathan
Tsunoda Tatsuhiko
Yoon Seungtai
Zhang Michael Q.
Publication venue: Genetics Society of America

Accurate information on haplotypes and diplotypes (haplotype pairs) is required for population-genetic analyses; however, microarrays do not provide data on a haplotype or diplotype at a copy number variation (CNV) locus; they only provide data on the total number of copies over a diplotype or an unphased sequence genotype (e.g., AAB, unlike AB of single nucleotide polymorphism). Moreover, such copy numbers or genotypes are often incorrectly determined when microarray signal intensities derived from different copy numbers or genotypes are not clearly separated due to noise. Here we report an algorithm to infer CNV haplotypes and individuals’ diplotypes at multiple loci from noisy microarray data, utilizing the probability that a signal intensity may be derived from different underlying copy numbers or genotypes. Performing simulation studies based on known diplotypes and an error model obtained from real microarray data, we demonstrate that this probabilistic approach succeeds in accurate inference (error rate: 1–2%) from noisy data, whereas previous deterministic approaches failed (error rate: 12–18%). Applying this algorithm to real microarray data, we estimated haplotype frequencies and diplotypes in 1486 CNV regions for 100 individuals. Our algorithm will facilitate accurate population-genetic analyses and powerful disease association studies of CNVs.

Crossref

PubMed Central

Mixture modeling of microarray gene expression data

Author: Ahn Kwangmi
Finch Stephen J
Gordon Derek
Kim Wonkuk
Lee Jung Yeon
Mao Wenyang
Mendell Nancy R
Tashman Adam P
Yang Yang
Yoon Seungtai
Publication venue: BioMed Central
Publication date: 01/01/2007

About 28% of genes appear to have an expression pattern that follows a mixture distribution. We use first- and second-order partial correlation coefficients to identify trios and quartets of non-sex-linked genes that are highly associated and that are also mixtures. We identified 18 trio and 35 quartet mixtures and evaluated their mixture distribution concordance. Concordance was defined as the proportion of observations that simultaneously fall in the component with the higher mean or simultaneously in the component with the lower mean based on their Bayesian posterior probabilities. These trios and quartets have a concordance rate greater than 80%. There are 33 genes involved in these trios and quartets. A factor analysis with varimax rotation identifies three gene groups based on their factor loadings. One group of 18 genes has a concordance rate of 56.7%, another group of 8 genes has a concordance rate of 60.8%, and a third group of 7 genes has a concordance rate of 69.6%. Each of these rates is highly significant, suggesting that there may be strong biological underpinnings for the mixture mechanisms of these genes. Bayesian factor screening confirms this hypothesis by identifying six single-nucleotide polymorphisms that are significantly associated with the expression phenotypes of the five most concordant genes in the first group.

Cold Spring Harbor Laboratory Institutional Repository

PubMed Central

A Bayesian approach for applying Haseman-Elston methods

Author: Courtney Gray
Ellen L Goode
Kenny Qian Ye
Lynn Goldin
Nancy Role Mendell
Seungtai Yoon
Young Ju Suh
Publication venue: BioMed Central
Publication date: 01/01/2005

The main goal of this paper is to couple the Haseman-Elston method with a simple yet effective Bayesian factor-screening approach. This approach selects markers by considering a set of multigenic models that include epistasis effects. The markers are ranked based on their marginal posterior probability. A significant improvement over our previously proposed Bayesian variable selection methodology is a simple Metropolis-Hastings algorithm that requires minimal tuning of the prior settings. The algorithm, however, is also flexible enough for us to easily incorporate our hypotheses and avoid computational pitfalls. We apply our approach to the microsatellite data of Collaborative Studies on Genetics of Alcoholism using the coded values for the ALDX1 variable as our response.

CiteSeerX

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Principal components ancestry adjustment for Genetic Analysis Workshop 17 data

Author: A Dasgupta
AL Price
C Dering
CD Campbell
Eun Jung Yoon
Jane E Cerise
Jing Jin
Nancy R Mendell
S Purcell
Seungtai Yoon
SJ Kang
Stephen J Finch
Sun Jung Kang
X Zhu
Publication venue: BioMed Central
Publication date: 01/01/2011

Statistical tests on rare variant data may well have type I error rates that differ from their nominal levels. Here, we use the Genetic Analysis Workshop 17 data to estimate type I error rates and powers of three models for identifying rare variants associated with a phenotype: (1) by using the number of minor alleles, age, and smoking status as predictor variables; (2) by using the number of minor alleles, age, smoking status, and the identity of the population of the subject as predictor variables; and (3) by using the number of minor alleles, age, smoking status, and ancestry adjustment using 10 principal component scores. We studied both quantitative phenotype and a dichotomized phenotype. The model with principal component adjustment has type I error rates that are closer to the nominal level of significance of 0.05 for single-nucleotide polymorphisms (SNPs) in noncausal genes for the selected phenotype than the model directly adjusting for population. The principal component adjustment model type I error rates are also closer to the nominal level of 0.05 for noncausal SNPs located in causal genes for the phenotype. The power for causal SNPs with the principal component adjustment model is comparable to the power of the other methods. The power using the underlying quantitative phenotype is greater than the power using the dichotomized phenotype.

Crossref

Springer - Publisher Connector

PubMed Central

Rates of contributory de novo mutation in high and low-risk autism families.

Author: Andrews Peter
Baldwin Kristin K
Buja Andreas
Iossifov Ivan
Krieger Abba M
Lee Yoon-Ha
Levy Dan
Marks Steven
Munoz Adriana
Pradhan Kith
Reeves Catherine
Ronemus Michael
Wang Zihua
Wigler Michael
Winterkorn Lara
Yamrom Boris
Yoon Seungtai
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2021

Autism arises in high and low-risk families. De novo mutation contributes to autism incidence in low-risk families as there is a higher incidence in the affected of the simplex families than in their unaffected siblings. But the extent of contribution in low-risk families cannot be determined solely from simplex families as they are a mixture of low and high-risk. The rate of de novo mutation in nearly pure populations of high-risk families, the multiplex families, has not previously been rigorously determined. Moreover, rates of de novo mutation have been underestimated from studies based on low resolution microarrays and whole exome sequencing. Here we report on findings from whole genome sequence (WGS) of both simplex families from the Simons Simplex Collection (SSC) and multiplex families from the Autism Genetic Resource Exchange (AGRE). After removing the multiplex samples with excessive cell-line genetic drift, we find that the contribution of de novo mutation in multiplex is significantly smaller than the contribution in simplex. We use WGS to provide high resolution CNV profiles and to analyze more than coding regions, and revise upward the rate in simplex autism due to an excess of de novo events targeting introns. Based on this study, we now estimate that de novo events contribute to 52-67% of cases of autism arising from low risk families, and 30-39% of cases of all autism.

Cold Spring Harbor Laboratory Institutional Repository

Directory of Open Access Journals

PubMed Central

Microduplications of 16p11.2 are associated with schizophrenia

Recurrent microdeletions and microduplications of a 600 kb genomic region of chromosome 16p11.2 have been implicated in childhood-onset developmental disorders [1-3]. Here we report the strong association of 16p11.2 microduplications with schizophrenia in two large cohorts. In the primary sample, the microduplication was detected in 12/1906 (0.63%) cases and 1/3971 (0.03%) controls (P = 1.2 × 10^-5, OR = 25.8). In the replication sample, the microduplication was detected in 9/2645 (0.34%) cases and 1/2420 (0.04%) controls (P = 0.022, OR = 8.3). For the series combined, microduplication of 16p11.2 was associated with 14.5-fold increased risk of schizophrenia (95% C.I. [3.3, 62]). A meta-analysis of multiple psychiatric disorders showed a significant association of the microduplication with schizophrenia, bipolar disorder and autism. The reciprocal microdeletion was associated only with autism and developmental disorders. Analysis of patient clinical data showed that head circumference was significantly larger in patients with the microdeletion compared with patients with the microduplication (P = 0.0007). Our results suggest that the microduplication of 16p11.2 confers substantial risk for schizophrenia and other psychiatric disorders, whereas the reciprocal microdeletion is associated with contrasting clinical features.

Carolina Digital Repository

Patterns and rates of exonic de novo mutations in autism spectrum disorders

Author: A McKenna
Andrew Kirby
Aniko Sabo
Avi Ma’ayan
Benjamin F. Voight
Benjamin M. Neale
Bernie Devlin
BJ O’Roak
BM Neale
Braden E. Boone
C Betancur
Catalina Betancur
Chad Schafer
Chiao-Feng Lin
Christine Stevens
D Pinto
DF Conrad
Donna Muzny
Edwin H. Cook Jr
EJ Rossin
Elaine Lim
Elizabeth Rossin
Emily L. Crawford
Eric Banks
Eric Boerwinkle
Evan T. Geller
Gerard D. Schellenberg
Guiqing Cai
GV Kryukov
H Li
Han Liu
IA Adzhubei
Irene Newsham
J Hallmayer
J Sebat
Jack R. Wimbish
James S. Sutcliffe
Jared Maguire
Jason Flannick
Jayon Lihm
Jeffrey G. Reid
JF Crow
Joseph D. Buxbaum
K Lage
Kaitlin E. Samocha
Kathryn Roeder
Khalid Shakir
Kiran Garimella
Li Liu
Li-San Wang
Lora Lewis
MA DePristo
Mark DePristo
Mark J. Daly
MC Wu
Menachem Fromer
Nicholas G. Campbell
Omar Jabado
Otto Valladares
P Lichtenstein
Paz Polak
Richard A. Gibbs
Ruth Dannenfelser
Ryan Poplin
Seungtai Yoon
Shamil Sunyaev
Shawn E. Levy
SJ Sanders
Stacey Gabriel
Tim Fennell
Tuo Zhao
Uma Nagaswamy
Vladimir Makarov
Yan Kou
Yi Han
Yuanqing Wu
Zuleyma Peralta
Publication venue: Springer Science and Business Media LLC
Publication date: 04/04/2012

Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified [1,2]. To identify further genetic risk factors, we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant and the overall rate of mutation is only modestly higher than the expected rate. In contrast, there is significantly enriched connectivity among the proteins encoded by genes harboring de novo missense or nonsense mutations, and excess connectivity to prior ASD genes of major effect, suggesting a subset of observed events are relevant to ASD risk. The small increase in rate of de novo events, when taken together with the connections among the proteins themselves and to ASD, is consistent with an important but limited role for de novo point mutations, similar to that documented for de novo copy number variants. Genetic models incorporating these data suggest that the majority of observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (i.e., not necessarily causal). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5 to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favor of CHD8 and KATNAL2 as genuine autism risk factors.

Crossref

Harvard University - DASH

HAL-Inserm

HAL Descartes

PubMed Central

Microduplications of 16p11.2 are associated with schizophrenia

Author: Addington Anjene M.
Bhandari Abhishek
Chitkara Nisha
Christian Susan L.
Cichon Sven
Craddock Nick
Crow Timothy J.
DeLisi Lynn E.
DeRosse Pamela
Deutsch Curtis K.
Dickel Diane E.
Gallagher Louise
Ganesh Jaya
Gary Sydney
Gill Michael
Goodell Meredith
Grozeva Detelina
Haldeman-Englert Chad
Iakoucheva Lilia M
Kaplan Paige
Kassem Layla
Kendall Jude
King Mary-Claire
Kirov George
Krantz Ian D.
Krastoshevsky Olga
Krause Verena
Kumar Ravinesh A.
Kusenda Mary
Kustanovich Vlad
Lajonchere Clara M.
Lakshmi B.
Lee Yoon-ha
Lehtimäki Terho
Leibenluft Ellen
Leotta Anthony
Levy Deborah L.
Lieberman Jeffrey A.
Makarov Vladimir
Malhotra Anil K.
Malhotra Dheeraj
McCarthy Shane E.
McClellan Jon
McMahon Francis J.
Mendell Nancy R.
Nöthen Markus M.
Owen Michael J.
O’Donovan Michael C.
Pavon Kevin
Pearl Justin
Perkins Diana
Potash James B.
Puura Kaija
Rapoport Judith
Rietschel Marcella
Roccanova Patricia
Schulze Thomas G.
Sebat Jonathan
Shaikh Tamim H.
Skuse David
Spinner Nancy B.
Steele Jo
Stroup T. Scott
Sullivan Patrick
Susser Ezra
Sutcliffe James S.
Vacic Vladimir
Walsh Tom
Wellcome Trust Case Control Consortium
Willour Virginia L.
Wolff Jessica
Yoon Seungtai
Zackai Elaine H.
Publication date: 01/01/2009

Recurrent microdeletions and microduplications of a 600-kb genomic region of chromosome 16p11.2 have been implicated in childhood-onset developmental disorders [1,2,3]. We report the association of 16p11.2 microduplications with schizophrenia in two large cohorts. The microduplication was detected in 12/1,906 (0.63%) cases and 1/3,971 (0.03%) controls (P = 1.2 × 10^−5, OR = 25.8) from the initial cohort, and in 9/2,645 (0.34%) cases and 1/2,420 (0.04%) controls (P = 0.022, OR = 8.3) of the replication cohort. The 16p11.2 microduplication was associated with a 14.5-fold increased risk of schizophrenia (95% CI (3.3, 62)) in the combined sample. A meta-analysis of datasets for multiple psychiatric disorders showed a significant association of the microduplication with schizophrenia (P = 4.8 × 10^−7), bipolar disorder (P = 0.017) and autism (P = 1.9 × 10^−7). In contrast, the reciprocal microdeletion was associated only with autism and developmental disorders (P = 2.3 × 10^−13). Head circumference was larger in patients with the microdeletion than in patients with the microduplication (P = 0.0007).

Carolina Digital Repository

Mapping copy number variation by population-scale genome sequencing

Author: Adrian M. Stütz
AJ Iafrate
AJ Sharp
Alexander Eckehart Urban
Alexej Abyzov
Asif Chinwalla
AW Pang
Aylwyn Scally
C Alkan
Can Alkan
Chang-Yun Lin
Charles Lee
Chip Stewart
CJ Willer
D Pinto
DA Hinds
Deniz Kural
DF Conrad
DM Altshuler
Donald F. Conrad
DY Chiang
E Tuzun
Ekta Khurana
Evan E. Eichler
F Hormozdiari
Fabian Grubert
Fereydoun Hormozdiari
Gabor T. Marth
Gil McVean
H Stefansson
Heather E. Peckham
Hugo Y. K. Lam
HY Lam
I Hajirasouliha
Iman Hajirasouliha
J Harrow
J Sebat
JA Bailey
JA Lee
James Nemesh
Jan O. Korbel
Jeffrey M. Kidd
Jerilyn A. Walker
Jiantao Wu
Jing Leng
JM Kidd
JO Korbel
Jonathan Sebat
Joshua Korn
JR Lupski
JT Simpson
Jun Wang
K Chen
K Ye
Kai Ye
Ken Chen
Kenny Ye
KJ McKernan
Klaudia Walter
Li Ding
Lilia M. Iakoucheva
Mark A. Batzer
Mark B. Gerstein
Matthew E. Hurles
Michael P. Stromberg
Michael Snyder
Miriam K. Konkel
N Craddock
P Medvedev
P Stankiewicz
PH Sudmant
PJ Campbell
PJ Hastings
R Li
R. Keira Cheetham
RE Mills
Robert E. Handsaker
Ruibang Luo
Ruiqiang Li
Ryan E. Mills
S Lee
S Levy
S Yoon
SA McCarroll
SE McCarthy
Seungtai Chris Yoon
Shuli Kang
Steven A. McCarroll
Tobias Rausch
Xinghua Shi
Xinmeng Jasmine Mu
Y Hasin-Brumshtein
Yingrui Li
Yujun Zhang
Yutao Fu
Zamin Iqbal
Zhengdong D. Zhang
Publication venue: Springer Science and Business Media LLC

Crossref